AITopics | language 0

Collaborating Authors

language 0

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Feedback Forensics: A Toolkit to Measure AI Personality

Findeis, Arduin, Kaufmann, Timo, Hüllermeier, Eyke, Mullins, Robert

arXiv.org Artificial IntelligenceOct-1-2025

Some traits making a "good" AI model are hard to describe upfront. For example, should responses be more polite or more casual? Such traits are sometimes summarized as model character or personality. Without a clear objective, conventional benchmarks based on automatic validation struggle to measure such traits. Evaluation methods using human feedback such as Chatbot Arena have emerged as a popular alternative. These methods infer "better" personality and other desirable traits implicitly by ranking multiple model responses relative to each other. Recent issues with model releases highlight limitations of these existing opaque evaluation approaches: a major model was rolled back over sycophantic personality issues, models were observed overfitting to such feedback-based leaderboards. Despite these known issues, limited public tooling exists to explicitly evaluate model personality. We introduce Feedback Forensics: an open-source toolkit to track AI personality changes, both those encouraged by human (or AI) feedback, and those exhibited across AI models trained and evaluated on such feedback. Leveraging AI annotators, our toolkit enables investigating personality via Python API and browser app. We demonstrate the toolkit's usefulness in two steps: (A) first we analyse the personality traits encouraged in popular human feedback datasets including Chatbot Arena, MultiPref and PRISM; and (B) then use our toolkit to analyse how much popular models exhibit such traits. We release (1) our Feedback Forensics toolkit alongside (2) a web app tracking AI personality in popular models and feedback datasets as well as (3) the underlying annotation data at https://github.com/rdnfn/feedback-forensics.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2509.26305

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Brain regions

Neural Information Processing SystemsAug-16-2025, 02:08:57 GMT

A system of regions (also referred to as a network) can comprise multiple disjoint regions that exhibit shared activity patterns across a range of tasks. The auditory system is located in the superior temporal region of the brain. The voxels were then filtered using gray-matter masking and (for MD and the Language systems) network localization. See Fedorenko et al. [2010] for a discussion of the functional localization approach as it pertains to the language network. For each brain system and each code property or code model, we run a separate MVP A analysis.

brain region, code model, code property, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Software > Programming Languages (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.47)

Add feedback

Evaluating AI-Generated Essays with GRE Analytical Writing Assessment

Zhong, Yang, Hao, Jiangang, Fauss, Michael, Li, Chen, Wang, Yuan

arXiv.org Artificial IntelligenceNov-12-2024

The recent revolutionary advance in generative AI enables the generation of realistic and coherent texts by large language models (LLMs). Despite many existing evaluation metrics on the quality of the generated texts, there is still a lack of rigorous assessment of how well LLMs perform in complex and demanding writing assessments. This study examines essays generated by ten leading LLMs for the analytical writing assessment of the Graduate Record Exam (GRE). We assessed these essays using both human raters and the e-rater automated scoring engine as used in the GRE scoring pipeline. Notably, the top-performing Gemini and GPT-4o received an average score of 4.78 and 4.67, respectively, falling between "generally thoughtful, well-developed analysis of the issue and conveys meaning clearly" and "presents a competent analysis of the issue and conveys meaning with acceptable clarity" according to the GRE scoring guideline. We also evaluated the detection accuracy of these essays, with detectors trained on essays generated by the same and different LLMs.

ai-generated essay, llm, similarity, (17 more...)

arXiv.org Artificial Intelligence

2410.17439

Country:

Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre:

Research Report > New Finding (0.69)
Research Report > Experimental Study (0.67)

Industry: Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)

Add feedback

The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units

AlKhamissi, Badr, Tuckute, Greta, Bosselut, Antoine, Schrimpf, Martin

arXiv.org Artificial IntelligenceNov-4-2024

Large language models (LLMs) exhibit remarkable capabilities on not just language tasks, but also various tasks that are not linguistic in nature, such as logical reasoning and social inference. In the human brain, neuroscience has identified a core language system that selectively and causally supports language processing. We here ask whether similar specialization for language emerges in LLMs. We identify language-selective units within 18 popular LLMs, using the same localization approach that is used in neuroscience. We then establish the causal role of these units by demonstrating that ablating LLM language-selective units -- but not random units -- leads to drastic deficits in language tasks. Correspondingly, language-selective LLM units are more aligned to brain recordings from the human language system than random units. Finally, we investigate whether our localization method extends to other cognitive domains: while we find specialized networks in some LLMs for reasoning and social capabilities, there are substantial differences among models. These findings provide functional and causal evidence for specialization in large language models, and highlight parallels with the functional organization in the brain.

language 0, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2411.0228

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback